perm filename HAW.DOC[AM,DBL] blob
sn#275333 filedate 1977-04-09 generic text, type T, neo UTF8
00001 Here is my paper for the Rule-based Inference conference. The format
00006 is close to -- but not absolutely identical to -- the one recommended
00011 in Waterman and Hayes-Roth's memo.
00016
00021 To denote ITALICS, we use the notation &i[I am italicised!!].
00026 To denote underlining, we say &u[I'm underlined.].
00031
00036 That is, ampersand, followed by "i" or "u", followed by the text in
00041 square brackets. Please note that this paper appears very long
00046 because it is in fixed-width font; the XGP version is of course much
00051 more compact. The name of this file is HAW.DOC[c380dl22], on
00056 CMU-10a.
00061
00066 I am in the process of finalizing the design of the new discovery
00071 program (the successor to AM), so any feedback you have on the
00076 paper would be very welcome.
00081
00086
00091 -- Doug Lenat
00100 Dear Don and Rick,
00200
00300 Here are the format conventions for this paper:
00400
00500
00600 Italics is indicated as follows: @i[some italicised text]
00700 Underlining is thus indicated: @u[this is underlined]
00800 References are in single brackets: see Waterman[25] for details
00900 Temporary switching to un-italicised font is occasionally also used:
01000 @i[here in an italicised line is one @r[unitalicised] word]
01100 Index entries are superscripted and in brackets: [%%249]
01200 Footnotes are superscripted: @r[3]
01300
01400
01500 Titles and subtitles are marked as if they were italicised, but feel
01600 free to use various other, bolder, bigger fonts for them.
01700
01800 The final reference (to Waterman, reference number 25) may not be
01900 complete. Please try to fill in any more details you can on it. All
02000 other references are complete. No further changes need be made.
02100 Almost all the reviewer's suggestions have been followed.
02200
02300 Can't wait until the conference!!
02400
02500 Regards,
02600 Doug and Greg
00100
00200
00300
00400 @i[Designing a Rule System That
00500 Searches for Scientific Discoveries]
00600
00700
00800
00900
01000
01100
01200 @i[Douglas B. Lenat
01300 and
01400 Gregory Harris]
01500
01600
01700 @i[Computer Science Department
01800 Carnegie-Mellon University
01900 Pittsburgh, Pa. 15213]
02000
02100
02200
02300
02400
02500
02600
02700
02800
02900
03000 DRAFT (@i[April 8, 1977])
03100
03200
05100 This work was supported in part by the Defense Advanced Research
05200 Projects Agency (F44620-73-C-0074) and monitored by the Air Force
05300 Office of Scientific Research.
00100 Lenat & Harris p. i
00200
00300
00400 @i[1. The Basic Argument] 1
00500
00600 @i[2. Early Design Constraints] 4
00700
00800 [%%190]
00900 @i[3. @u['AM']: A Rule System For Math Theory Formation ] 6
01000 @i[3.1.] Discovery in Mathematics as Heuristic Rule-Guided Search 7
01100 [%%165]
01200 @i[3.2.] Representation of Mathematical Knowledge 8
01300 @i[3.3.] Top-level Control: An Agenda of Promising Questions 9
01400 @i[3.4.] Low-level Control: A Lattice of Heuristic Rules 13
01500 @i[3.5.] Behavior of this Rule System 15
01600
01700 @i[4. Reexamining the Design] 19
01800 @i[4.1.] Data Structures 19
01900 [%%225]
02000 @i[4.2.] Rules 25
02100 @i[4.3.] Distribution of Knowledge Between Rules and DS 31
02200 @i[4.4.] Interpreter 35
02300
02400 @i[5. Speculations for a New Discovery System] 36
02500 @i[5.1.] A New Set of Design Constraints 37
00100
00200
00300
00400
00500
00600
00700
00800
00900
01000
01100 @i[ABSTRACT]
01200
01300
02500 Some scientific inference tasks (including mass spectrum
02600 identification [Dendral], medical diagnosis [Mycin], and math
02700 theory development [AM]) have been successfully modelled as rule-
02800 directed search processes. These rule systems are designed quite
02900 differently from "pure production systems". By concentrating upon
03000 the design of one program (AM), we shall show how 13 kinds of
03100 design deviations arise from @i[(i)] the level of sophistication of
03200 the task that the system is designed to perform, @i[(ii)] the
03300 inherent nature of the task, and @i[(iii)] the designer's @i[view]
03400 of the task. The limitations of AM suggest even more radical
03500 departures from traditional rule system architecture. All these
03600 modifications are then collected into a new, complicated set of
03700 constraints on the form of the data structures, the rules, the
03800 interpreter, and the distribution of knowledge between rules and
03900 data structures. These new policies sacrifice uniformity in the
04000 interests of clarity, efficiency and power derivable from a
04100 thorough characterization of the task. Rule systems whose
04200 architectures conform to the new design principles will be more
04300 awkward for many tasks than would "pure" systems. Nevertheless,
04400 the new architecture should be significantly more powerful and
04500 natural for building rule systems that do scientific discovery
04600 tasks.
00100 Lenat & Harris p. 1
00200
00300
00400 1. The Basic Argument
00500
00600
00700 Although rule-based computation was originally used for formal and systems
00800
00900 purposes [Post,Markov,Floyd], researchers in Artificial Intelligence (AI) found
01000
01100 that the same methodology was also useful for modelling a wide variety of
01200
01300 [%%161]
01400 sophisticated tasks. Many of these early AI rule-based programs -- called
01500
01600 [%%220]
01700 "production systems" -- served as information processing models of humans
01800
01900 performing cognitive tasks in several domains (digit recall [19], algebra word
02000
02100 [%%249]
02200 problem solving [1], poker playing [23], etc. [16,18]).
02300
02400
02500 [%%260]
02600 There were many design constraints present in the classical formal rule-
02700
02800 based systems. Many of these details were preserved in the AI production-
02900
03000 [%%219]
03100 rule-based programs (e.g., forcing all state information into a single
03200
03300 string of tokens). But there were many changes. The whole notion of "what a rule
03400
03500 system really is" changed from an effective problem statement to a tendency to
03600
03700 solve problems in a particular way. One typical corollary of this change of
03800
03900 view was that instead of @i[no] external inputs whatsoever, there was now a
04000
04100 @i[presumption] of some "environment" which supplied new entries into the token
04200
04300 sequence. In the next section (see Figure 1) is an articulation of these
04400
04500 [%%223]
04600 @i[neo]-classical (i.e., AI circa 1973; see [7]) principles for designing
04700
04800 "pure" production systems.
04900
05000
05100 Due to the early successes, psychological applicability, and aesthetic
00100 Lenat & Harris p. 2
00200
00300
00400 simplicity afforded by production systems, AI researchers began to write rule
00500
00600 [%%147]
00700 systems (RSs) to perform informal inductive inference tasks (mass
00800
00900 spectrum identification [4], medical diagnosis [23] and consultation dialogue
01000
01100 [6], speech understanding [14], non-resolution theorem proving [0], math
01200
01300 research [13], and many more).
01400
01500
01600 Yet it seems that most of the large, successful RSs have violated many of the
01700
01800 "pure production system" guidelines. The purpose of this paper is to show that
01900
02000 such "exceptions" were inevitable, because any system satisfying the neo-
02100
02200 classical design constraints, though universal in principle, is too impoverished
02300
02400 to represent complex tasks for what they are.
02500
02600
02700 [%%23]
02800 The essence of the neo-classical architecture is to opt for simplicity in
02900
03000 all things, since there is very little one can say about RSs in general. As
03100
03200 more becomes known about the task of the RS, it turns out that some of that new
03300
03400 knowledge takes the form of specific constraints on the design of the RS itself
03500
03600 (as distinct from what specific knowledge we choose to represent within that
03700
03800 design). Sometimes a new constraint directly contradicts the early, domain-
03900
04000 independent one; sometimes it is merely a softening or augmentation of the old
04100
04200 constraint.
04300
04400
04500 After examining the "pure" architecture, we shall examine in detail the design
04600
04700 of one particular rule system which discovers and studies mathematical concepts.
04800
04900 Deviations from the pure architecture will be both frequent and extreme.
00100 Lenat & Harris p. 3
00200
00300
00400 Subsequent sections will analyze these differences. It will be shown that each
00500
00600 one is plausible -- usually for reasons which depend strongly on the "scientific
00700
00800 discovery" domain of the RS. Some of the limitations of this RS will be treated,
00900
01000 and their elimination will be seen to require abandoning still more of the
01100
01200 original design constraints.
01300
01400
01500 When these modifications are collected, in the final section, we shall have
01600
01700 quite a different set of principles for building RSs. Not only will naivete
01800
01900 have been lost: so will generality (the breadth of kinds of knowledge
02000
02100 representable, the totality of tractable tasks). Rule systems conforming to the
02200
02300 new design will be awkward for many tasks (just as a sledge hammer is awkward
02400
02500 for cracking eggs). However, they should be significantly more powerful and
02600
02700 natural for scientific inference tasks.
00100 Lenat & Harris p. 4
00200
00300
00400 2. Early Design Constraints
00500
00600
00700 By a @i[rule system] (RS) we shall mean any collection of condition-action
00800
00900 @i[rules], together with associated @i[data structures] (DS; also called
01000
01100 @i[memories]) which the rules may inspect and alter. There must also be a
01200
01300 [%%221]
01400 policy for @i[interpretation]: detecting and firing relevant rules.
01500
01600
01700 These definitions are deliberately left vague. Many details must be specified
01800
01900 for any actual rule system (e.g., What may appear in the condition part of a
02000
02100 [%%260]
02200 rule?). This specification process is what we mean by @i[designing] a RS.
02300
02400
02500 [%%54]
02600 Figure 1 contains an articulation of the design of the early general-
02700
02800 purpose AI production rule systems. Notice the common theme: the adequacy of
02900
03000 simplicity in all dimensions.
03100
03200
03300 [%%23]
03400 FIGURE 1: Neo-classical Rule System Architecture
03500
03600 @r[1. ]Principle of Simple Memories. One or two uniform data
03700 [%%142]
03800 structures define sufficient memories for a rule system to
03900 [%%101]
04000 read from and write into. The format for entries in these
04100 structures is both uncomplicated and unchanging.
04200
04300 [%%74]
04400 @r[2. ]Principle of Simple DS Accesses. The primitive read and
04500 write operations are as simple and low-level as possible;
04600 typically they are simply a membership-test type of read, and an
04700 insert-new-element type of write. More complicated, algorithmic
04800 operations on the memories are not available to the rules.
04900
05000 @r[3. ]Principle of Isolated DS Elements. Elements of the uniform
05100 DS cannot point to (parts of) other elements. This follows from
05200 the preceding principle: if we aren't allowed to @u[chase]
05300 pointers, there may as well not be any.
00100 Lenat & Harris p. 5
00200
00300
00400 @r[4. ]Principle of Continuous Attention. In addition to the one or
00500 two simple data structures, there may be an external environment
00600 which continuously inserts stimuli into the DS. The interleaving
00700 of stimuli and internally generated symbols is managed quite
00800 trivially: (a) The stimuli are simply inserted into the DS as new
00900 elements; (b) Each rule is so small and quick that no
01000 [%%159]
01100 "interruption" mechanism is necessary. The interpreter may
01200 ignore any suddenly-added stimulus until the current rule finishes
01300 executing. The RS may be viewed as "continuously" attending to
01400 the environment.
01500
01600 @r[5. ]Principle of Opaque Rules. Rules need not have a format
01700 inspectable by other rules, but rather can be coded in whatever
01800 way is convenient for the programmer and the rule interpreter;
01900 i.e., the set of rules is @u[not] treated as one of the RS's data
02000 structures. E.g., the condition parts of rules may be barred from
02100 fully analyzing the set of productions [22], and the action parts
02200 of rules may not be allowed to delete existing rules [24].
02300
02400 @r[6. ]Principle of Simple Rules. Rules consist of a left- and a
02500 right-hand side which are quite elementary: The left hand side
02600 (lhs, situation characterization, IF-part, condition) is typically
02700 [%%64]
02800 a pattern-match composed with a primitive DS read access,
02900 [%%5]
03000 and the right hand side (rhs, consequence, THEN-part, action)
03100 is also simply a primitive DS write access. There is no need for
03200 sophisticated bundles of DS accesses on either side of a rule.
03300 Thus several extra rules should be preferred to a single rule with
03400 several actions.
03500
03600 @r[7. ]Principle of Encoding by Coupled Rules. A collection of
03700 interrelated rules is used to accomplish each subtask; i.e.,
03800 wherever a subroutine would be used in a procedural programming
03900 language. For example, programming an iteration may require many
04000 rules "coupled" by writing and reading special (i.e., otherwise
04100 meaningless) loop control notes in the data structure.
04200
04300 @r[8. ]Principle of Knowledge as Rules. All knowledge of substance
04400 should be, can be, and is represented as rules. This includes all
04500 non-trivial domain-dependent information. The role of the DS is
04600 just to hold simple descriptive information, intermediate control
04700 state messages, recent stimuli from the environment, etc.
04800
04900 @r[9. ]Principle of Simple Interpretation. The topmost control flow
05000 [%%159]
05100 in the RS is via a simple rule interpreter. After a rule
05200 fires, it is essential that any rule in the system may potentially
00100 Lenat & Harris p. 6
00200
00300
00400 be the next one to fire (i.e., it is forbidden to locate a set of
00500 relevant rules and fire them off in sequence). When the rhs of a
00600 rule is executed, it can (and frequently will) drastically alter
00700 the situation that determined which rules were relevant.
00800
00900 @r[10. ]Principle of Closure. The representations allowed by (1-9)
01000 are sufficient and appropriate for organizing all the kinds of
01100 knowledge needed for tasks for which a given RS is designed.
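As an illustrative sketch of Principles 1-10 (rendered in modern Python with invented names; none of the 1973-era systems were written this way), a "pure" production system reduces to one simple memory, opaque rules, and a trivial recognize-act interpreter:

```python
# A toy "pure" production system in the spirit of Figure 1 (a hypothetical
# modern sketch). One uniform memory -- a set of tokens -- with primitive
# membership-test reads and insert/delete writes (Principles 1-3), opaque
# (condition, action) rule pairs (Principle 5), and an interpreter that
# rescans every rule after each firing (Principle 9).

def run(rules, memory, limit=100):
    """Fire the first rule whose condition holds; repeat until quiescent."""
    for _ in range(limit):
        for cond, act in rules:            # rules are opaque to each other
            if cond(memory):
                act(memory)                # a single primitive write access
                break                      # any rule may fire next (Principle 9)
        else:
            return memory                  # no rule fired: halt
    return memory

def step(m):
    # advance the loop-control note ("count", n) -> ("count", n+1)
    n = next(t[1] for t in m if t[0] == "count")
    m.discard(("count", n))
    m.add(("count", n + 1))

# Two "coupled" rules encode an iteration (Principle 7): they communicate
# only through otherwise-meaningless control notes written into the DS.
rules = [
    (lambda m: ("count", 3) in m,
     lambda m: m.discard(("count", 3)) or m.add("done")),
    (lambda m: any(t[0] == "count" and t[1] < 3 for t in m),
     lambda m: step(m)),
]

memory = run(rules, {("count", 0)})
```

Note how even a three-step loop costs two rules plus a control-token convention in the memory; this is the coupling overhead that Section 4 returns to.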
01200
01300
01400
01500 This design was plausible @i[a priori], and worked quite well for its initial
01600
01700 applications (the simulation of simple human cognitive processes [16,19,24]).
01800
01900 But is this design proper for @i[any] RS, regardless of its intended task? In
02000
02100 particular, what about scientific inference tasks? Over the years, several
02200
02300 rule-based inference systems for scientific tasks have been constructed. With
02400
02500 each new success have come some deviations from the above principles [7]. Were
02600
02700 these mere aberrations, or is there some valid reason for such changes in
02800
02900 design?
03000
03100
03200 We claim the latter. The task domain -- scientific discovery -- dictates a new
03300
03400 and quite different architecture for RSs. To study this phenomenon, we shall
03500
03600 describe, in the next section, one particular RS which defines new mathematical
03700
03800 concepts, studies them, and conjectures relationships between them. Subsequent
03900
04000 sections will explore the deviations of its design from the neo-classical
04100
04200 constraints in Figure 1.
04300
04400
04500 [%%190]
04600 3. @u['AM']: A Rule System For Math Theory Formation
04700
04800
04900 A recent thesis [13] describes a program, called "AM", which gradually expands a
05000
05100 base of mathematical knowledge. The representation of math facts is somewhat
00100 Lenat & Harris p. 7
00200
00300
00400 [%%7]
00500 related to Actors [10] and Beings [12] in the partitioning of such domain
00600
00700 knowledge into effective, structured modules. Departing from the traditional
00800
00900 [%%116]
01000 control structures usually associated with Actors, Beings, and Frames
01100
01200 [15], AM concentrates on one "interesting" mini-research question after another.
01300
01400 [%%130]
01500 These "jobs" are proposed by -- and rated by -- a collection of
01600
01700 approximately 250 situation-action rules. Discovery in mathematics is modelled
01800
01900 in AM as a rule-guided exploration process. This view is explained below in
02000
02100 Section 3.1 (See also [21].) The representation of knowledge is sketched next,
02200
02300 followed by a much more detailed description of the rule-based control structure
02400
02500 of AM. Finally, in Section 3.5, the experimental results of the project are
02600
02700 summarized.
02800
02900
03000 3.1. Discovery in Mathematics as Heuristic Rule-Guided Search
03100
03200
03300 [%%190]
03400 The task which AM performs is the discovery of new mathematical concepts
03500
03600 and relationships between them. The simple paradigm it follows for this task is
03700
03800 [%%71]
03900 to maintain a graph of partially-developed concepts, and to obey a large
04000
04100 [%%166]
04200 collection of "heuristics" (rules which frequently lead to discoveries)
04300
04400 which guide it to define and study the most plausible thing next.
04500
04600
04700 For example, at one point AM had some notions of sets, set-operations, numbers,
04800
04900 and simple arithmetic. One heuristic rule it knew said @i["If f is an
05000
05100 interesting relation, Then look at its inverse"]. This rule fired after AM had
00100 Lenat & Harris p. 8
00200
00300
00400 studied "multiplication" for a while. The rhs of the rule then directed AM to
00500
00600 define and study the relation "divisors-of" (e.g., divisors-of(12) =
00700
00800 {1,2,3,4,6,12}). Another heuristic rule which later fired said @i["If f is a
00900
01000 relation from A into B, then it's worth examining those members of A which map
01100
01200 into @u[extremal] members of B"]. In this case, f was matched to "divisors-of",
01300
01400 A was "numbers", B was "sets of numbers", and an extremal member of B might be,
01500
01600 e.g., a very @i[small] set of numbers. Thus this heuristic rule caused AM to
01700
01800 define the set of numbers with no divisors, the set of numbers with only 1
01900
02000 divisor, with only 2 divisors, etc. One of these sets (the last one mentioned)
02100
02200 turned out subsequently to be quite important; these numbers are of course the
02300
02400 primes. The above heuristic also directed AM to study numbers with very
02500
02600 @i[many] divisors; such highly-composite numbers were also found to be
02700
02800 interesting.
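The two heuristic rules just traced can be sketched as follows (in modern Python of our own invention; AM itself was a LISP program, and its actual rules were far richer):

```python
# Sketch of the two heuristics in the text: invert an interesting relation
# ("multiplication" -> "divisors-of"), then examine arguments that map to
# extremal values of the new relation.

def divisors_of(n):
    """The inverse direction of multiplication: all z with z | n."""
    return {z for z in range(1, n + 1) if n % z == 0}

def with_k_divisors(k, limit):
    """Extremal-members heuristic: numbers whose divisor set has size k."""
    return [n for n in range(1, limit + 1) if len(divisors_of(n)) == k]

primes = with_k_divisors(2, 30)   # exactly 2 divisors: the primes
units = with_k_divisors(1, 30)    # exactly 1 divisor: just the number 1
```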
02900
03000
03100 [%%69]
03200 This same paradigm enabled AM to discover concepts which were much more
03300
03400 primitive (e.g., cardinality) and much more sophisticated (e.g., the fundamental
03500
03600 theorem of arithmetic) than prime numbers. We shall now describe the AM program
03700
03800 in more detail.
03900
04000
04100 [%%165]
04200 3.2. Representation of Mathematical Knowledge
04300
04400
04500 What exactly does it mean for AM to "have the notion of" a concept? It means
04600
04700 [%%116]
04800 that AM possesses a frame-like data structure for that concept. For
04900
05000 [%%71]
05100 instance, here is how one concept looked after AM had defined and explored
05200
05300 it:
00100 Lenat & Harris p. 9
00200
00300
00400 FIGURE 2: A Typical Concept
00500
00600 NAME: Prime Numbers
00700 DEFINITIONS:
00800 ORIGIN: Number-of-divisors-of(x) = 2
00900 PREDICATE-CALCULUS: Prime(x) iff
01000 (For-all z) (z|x implies z=1 XOR z=x)
01100 ITERATIVE: (for x>1): For i from 2 to x-1, not (i|x)
01200 EXAMPLES: 2, 3, 5, 7, 11, 13, 17
01300 BOUNDARY: 2, 3
01400 BOUNDARY-FAILURES: 0, 1
01500 FAILURES: 12
01600 GENERALIZATIONS: Numbers, Numbers with an even number of divisors,
01700 Numbers with a prime number of divisors
01800 SPECIALIZATIONS: Odd Primes, Prime Pairs, Prime Uniquely-addables
01900 CONJECS: Unique factorization, Goldbach's conjecture, Extrema of Divisors-of
02000 ANALOGIES: Maximally-divisible numbers are converse extremes of Divisors-of
02100 INTEREST: Conjec's tying Primes to Times, to Divisors-of, to closely related
02200 WORTH: 800
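A frame such as Figure 2 might be held in a structure like the following (a hypothetical Python rendering; AM stored concepts as LISP property lists, and several facets are omitted here):

```python
# A cut-down rendering of the Primes concept of Figure 2 as a facet/value
# structure. Each facet can be read or filled in independently; executable
# facets (the definitions) are stored as code.

primes_concept = {
    "name": "Prime Numbers",
    "definitions": {
        # the ORIGIN definition: number-of-divisors-of(x) = 2
        "origin": lambda x: len([z for z in range(1, x + 1) if x % z == 0]) == 2,
    },
    "examples": [2, 3, 5, 7, 11, 13, 17],
    "boundary": [2, 3],
    "boundary-failures": [0, 1],
    "generalizations": ["Numbers"],
    "specializations": ["Odd Primes", "Prime Pairs"],
    "worth": 800,
}

# any stored definition can be run directly against a candidate
is_prime = primes_concept["definitions"]["origin"]
```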
02300
02400
02500
02600 3.3. Top-level Control: An Agenda of Promising Questions
02700
02800
02900 AM was initially given a collection of 115 core concepts, with only a few facets
03000
03100 (i.e., slots) filled in for each. AM repeatedly chooses some facet of some
03200
03300 concept, and tries to fill in some entries for that particular slot. To decide
03400
03500 which such job to work on next, AM maintains an @i[agenda] of jobs, a global
03600
03700 queue ordered by priority [2]. A typical job is @i["Fill in examples of
03800
03900 Primes"]. The agenda may contain hundreds of entries such as this one. AM
04000
04100 repeatedly selects the top job from the agenda and tries to carry it out. This
04200
04300 is the whole control structure! Of course, we must still explain how AM creates
04400
04500 plausible new jobs to place on the agenda, how AM decides which job will be the
04600
04700 best one to execute next, and how it carries out a job.
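This "whole control structure" can be sketched in a few lines (a hypothetical modern Python rendering, not AM's LISP; job creation and ranking are treated fully in the text below):

```python
# The top-level cycle: pop the highest-priority job, run each relevant
# heuristic rule on it, repeat. Rules may push new jobs while running.
import heapq

def top_level(agenda, rules_for):
    """agenda: list of (-priority, job); rules_for(job) -> heuristic rules."""
    heapq.heapify(agenda)                  # a global queue ordered by priority
    done = []
    while agenda:
        _, job = heapq.heappop(agenda)     # select the top job...
        for rule in rules_for(job):        # ...gather its relevant rules...
            rule(job, agenda)              # ...and execute them
        done.append(job)
    return done

def suggest_generalization(job, agenda):
    # a toy heuristic: working on Primes proposes a follow-up job
    if job == "Fill in examples of Primes":
        heapq.heappush(agenda, (-300, "Generalize the concept Primes"))

jobs = [(-800, "Fill in examples of Primes"),
        (-200, "Fill in examples of Set-union")]
done = top_level(jobs, lambda job: [suggest_generalization])
```

Note that the newly proposed job (priority 300) is executed before the lower-rated job already on the agenda.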
04800
04900
05000 If the job were @i["Fill in new Algorithms for Set-union"], then @i[satisfying]
00100 Lenat & Harris p. 10
00200
00300
00400 it would mean actually synthesizing some new procedures, some new LISP code
00500
00600 capable of forming the union of any two sets. A heuristic rule is @i[relevant]
00700
00800 to a job if and only if executing that rule brings AM closer to satisfying that
00900
01000 job. Potential relevance is determined @i[a priori] by where the rule is stored.
01100
01200 A rule tacked onto the Domain/range facet of the Compose concept would be
01300
01400 presumed potentially relevant to the job @i["Fill in the Domain of Insert-]o@i[-
01500
01600 Delete"]. The lhs of each potentially relevant rule is evaluated to determine
01700
01800 whether the rule is truly relevant.
01900
02000
02100 Once a job is chosen from the agenda, AM gathers together all the potentially
02200
02300 [%%78]
02400 relevant heuristic rules -- the ones which might accomplish that job.
02500
02600 They are executed, and then AM picks a new job. While a rule is executing,
02700
02800 three kinds of actions or effects can occur:
02900
03000 @i[(i)] Facets of some concepts can get filled in (e.g., examples of primes may
03100 actually be found and tacked onto the "Examples" facet of the "Primes"
03200 concept). A typical heuristic rule which might have this effect is:
03300
03400 If examples of X are desired, where X is a kind of Y (for some more
03500 general concept Y),
03600 Then check the examples of Y; some of them may be examples of X as well.
03700
03800 For the job of filling in examples of Primes, this rule would have AM notice
03900 that Primes is a kind of Number, and therefore look over all the known
04000 examples of Number. Some of those would be primes, and would be transferred
04100 to the Examples facet of Primes.
04200
04300 @i[(ii)] New concepts may be created (e.g., the concept "primes which are
04400 uniquely representable as the sum of two other primes" may somehow be
04500 deemed worth studying). A typical heuristic rule which might result in this
04600 new concept is:
04700
04800 If some (but not most) examples of X are also examples of Y (for some
04900 concept Y),
05000 Then create a new concept defined as the intersection of those 2
05100 concepts (X and Y).
00100 Lenat & Harris p. 11
00200
00300
00400 Suppose AM has already isolated the concept of being representable as the
00500 sum of two primes in only one way (AM actually calls such numbers "Uniquely-
00600 prime-addable numbers"). When AM notices that some primes are in this set,
00700 the above rule will create a brand new concept, defined as the set of
00800 numbers which are both prime and uniquely prime addable.
00900
01000 @i[(iii)] New jobs may be added to the agenda (e.g., the current activity may
01100 suggest that the following job is worth considering: "Generalize the concept
01200 of prime numbers"). A typical heuristic rule which might have this effect
01300 is:
01400
01500 If very few examples of X are found,
01600 Then add the following job to the agenda: "Generalize the concept X".
01700
01800
01900 The concept of an agenda is certainly not new: schedulers have been around for a
02000
02100 long time. But one important feature of AM's agenda scheme @i[is] a new idea:
02200
02300 attaching -- and using -- a list of quasi-symbolic reasons to each job which
02400
02500 explain why the job is worth considering, why it's plausible. It is the
02600
02700 responsibility of the heuristic rules to include reasons for any jobs they
02800
02900 propose. For example, let's reconsider the heuristic rule mentioned in @i[(iii)]
03000
03100 above. It really looks more like the following:
03200
03300 If very few examples of X are found,
03400 Then add the following job to the agenda: "Generalize the
03500 concept X", for the following reason: "X's are quite
03600 rare; a slightly less restrictive concept might be more
03700 interesting".
03800
03900
04000 If the same job is proposed by several rules, then several different reasons for
04100
04200 it may be present. In addition, one ephemeral reason also exists: "Focus of
04300
04400 attention" [9]. Any jobs which are related to the one last executed get "Focus
04500
04600 of attention" as a bonus reason. AM uses all these reasons to decide how to
04700
04800 rank the jobs on the agenda. Each reason is given a rating (by the heuristic
04900
05000 which proposed it), and the ratings are combined into an overall priority rating
00100 Lenat & Harris p. 12
00200
00300
00400 for each job on the agenda. The jobs are ordered by these ratings, so it is
00500
00600 trivial to select the job with the highest rating. Note that if a job already on
00700
00800 the agenda is re-proposed for a new reason, then its priority will increase. If
00900
01000 the job is re-proposed for an already-present reason, however, the overall
01100
01200 rating of the job will @i[not] increase. This turned out to be an important
01300
01400 enough phenomenon that it was presented in [13] as a necessary design
01500
01600 constraint.
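The reason-combining behavior just described can be sketched as follows (hypothetical Python; the combination formula here is a simple sum, whereas AM's was more elaborate):

```python
# An agenda whose job priorities combine the ratings of distinct reasons.
# Re-proposing a job for a NEW reason raises its priority; re-proposing it
# for an already-present reason does not.

class Agenda:
    def __init__(self):
        self.jobs = {}                        # job -> {reason: rating}

    def propose(self, job, reason, rating):
        reasons = self.jobs.setdefault(job, {})
        reasons.setdefault(reason, rating)    # a repeated reason adds nothing

    def priority(self, job):
        return sum(self.jobs[job].values())   # combine distinct reasons' ratings

    def best(self):
        return max(self.jobs, key=self.priority)

agenda = Agenda()
agenda.propose("Generalize Primes", "X's are quite rare", 400)
agenda.propose("Generalize Primes", "X's are quite rare", 400)   # no effect
agenda.propose("Generalize Primes", "Focus of attention", 100)   # raises priority
agenda.propose("Fill in examples of Set-union", "No examples yet", 300)
```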
01700
01800
01900 AM uses each job's list of reasons in other ways. Once a job has been selected,
02000
02100 the quality of the reasons is used to decide how much time and space the job
02200
02300 will be permitted to absorb, before AM quits and moves on to a new job. Another
02400
02500 use is to explain to the human observer precisely why the chosen top job is a
02600
02700 plausible thing for AM to concentrate upon.
00100 Lenat & Harris p. 13
00200
00300
00400 3.4. Low-level Control: A Lattice of Heuristic Rules
00500
00600
00700 The hundreds of concepts AM possesses are interrelated in many ways. One main
00800
00900 [%%120]
01000 organization is that provided by their Generalization and Specialization
01100
01200 facets. The concepts may be viewed as nodes on a large lattice whose edges are
01300
01400 labelled Genl/Spec. The importance of this organization stems from various
01500
01600 [%%138]
01700 @i[heritability] properties. For example, Spec is transitive, so the
01800
01900 specializations of Numbers include not only Primes but all @i[its]
02000
02100 specializations as well.
02200
02300
02400 Let us describe a second, very important heritability property. Each of the 250
02500
02600 heuristic rules is attached to the most general (or abstract) concept for which
02700
02800 it is deemed appropriate. A rule's relevance is assumed to be
02900
03000 inherited by all of that concept's specializations. For example, a heuristic method which is
03100
03200 capable of inverting any function will be attached to the concept "Function";
03300
03400 but it is certainly also capable of inverting any permutation. If there are no
03500
03600 known methods specific to the latter job, then AM will follow the Genl links
03700
03800 upward from Permutation to Bijection to Function..., seeking methods for
03900
04000 inversion. Of course the more general concepts' methods tend to be weaker than
04100
04200 those of the specific concepts.
04300
04400
04500 In other words, the Genl/Spec graph of concepts induces a graph structure upon
04600
04700 the set of heuristic rules. This permits potentially relevant rules to be
04800
04900 located efficiently. Here is one more example of how this heritability works in
05000
05100 practice: Immediately after the job "Fill in examples of Set-equality" is
00100 Lenat & Harris p. 14
00200
00300
00400 chosen, AM asks each generalization of Set-equality for help. Thus it asks for
00500
00600 ways to fill in examples of any Predicate, any Activity, any Concept, and
00700
00800 finally for ways to fill in examples of Anything. One such heuristic rule known
00900
01000 to the Activity concept says: @i["If examples of the domain of the activity f
01100
01200 are already known, Then actually execute f on some random members of its
01300
01400 domain."] Thus when AM applies this heuristic rule to fill in examples of Set-
01500
01600 equality, its Domain facet is inspected, and AM notes that Set-equality takes a
01700
01800 pair of sets as its arguments. Then AM accesses the Examples facet of the
01900
02000 concept Set, where it finds a large list of sets. The lhs is thus satisfied, so
02100
02200 the rule is fired. Obeying the heuristic rule, AM repeatedly picks a pair of
02300
02400 the known sets at random, and sees if they satisfy Set-equality (by actually
02500
02600 running the LISP function stored in the Algorithms facet of Set-equality).
02700
02800 While this will typically return False, it will occasionally locate -- by random
02900
03000 chance -- a pair of equal sets.
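The random-sampling behavior of that Activity heuristic can be sketched as follows (our Python; AM ran the LISP function stored in the Algorithms facet, where here an equality test stands in for it):

```python
# Run a predicate on random pairs drawn from its known domain examples,
# keeping the rare successes as new examples of the predicate.
import random

# five known sets, two of which are written differently but are equal
known_sets = [frozenset(), frozenset({1}), frozenset({1, 2}),
              frozenset({2, 1}), frozenset({3})]

def find_examples(predicate, pool, tries=200, seed=0):
    """Apply the predicate to random pairs; collect the pairs that satisfy it."""
    rng = random.Random(seed)
    hits = []
    for _ in range(tries):
        a, b = rng.choice(pool), rng.choice(pool)
        if predicate(a, b):
            hits.append((a, b))
    return hits

examples = find_examples(lambda a, b: a == b, known_sets)
```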
03100
03200
03300 Other heuristics, tacked onto other generalizations of Set-equality, provide
03400
03500 additional methods for executing the job "Fill in examples of Set-equality." A
03600
03700 heuristic stored on the concept Any-concept says to symbolically instantiate the
03800
03900 definition. After spending much time manipulating the recursive definition of
04000
04100 Set-equality, a few trivial examples (like {}={}) are produced. Notice that (as
04200
04300 expected) the more general the concept is, the weaker (more time-consuming, less
04400
04500 chance for success) its heuristics tend to be. For this reason, AM consults
04600
04700 each concept's rules in order of increasing generality.
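The upward walk along Genl links can be sketched as follows (hypothetical Python with a toy fragment of the lattice; AM's graph was a full lattice rather than the single chain shown here):

```python
# Rule inheritance up the Genl links: a concept inherits the heuristics of
# every generalization, consulted most-specific-first (i.e., in order of
# increasing generality, since the general rules tend to be weaker).

GENL = {"Permutation": "Bijection", "Bijection": "Function",
        "Set-equality": "Predicate", "Predicate": "Activity",
        "Activity": "Any-concept"}

RULES = {"Function": ["invert-function"],
         "Activity": ["run-f-on-random-domain-members"],
         "Any-concept": ["instantiate-definition"]}

def applicable_rules(concept):
    """Collect inherited rules, most specific concept first."""
    rules = []
    while concept is not None:
        rules.extend(RULES.get(concept, []))
        concept = GENL.get(concept)        # climb one Genl link
    return rules
```

With no inversion method known for Permutation itself, the walk finds the one attached to Function; for Set-equality it gathers the Activity heuristic before the weaker Any-concept one.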
00100 Lenat & Harris p. 15
00200
00300
00400 3.5. Behavior of this Rule System
00500
00600
00700 As the preceding four sections indicate, the dynamic behavior of AM was as
00800
00900 follows: a job is chosen from the agenda, potentially relevant rules are located
01000
01100 by their position in the Genl/Spec lattice, their lhs's (left-hand sides) are
01200
01300 evaluated to find those which actually trigger, they are then executed (in order
01400
01500 of decreasing specificity) until all have run (or until some @i[local],
01600
01700 self-imposed limit on time or space is exceeded), and the cycle repeats. AM has
01800
01900 a modest facility that prints out a description of these activities as they
02000
02100 occur. Here is a tiny excerpt of this self-trace monologue.
02200
02300 ** @u[Job 65:] ** Fill in Examples of the concept "Divisors-of".
02400
02500 3 Reasons: (1) No known examples of Divisors-of so far.
02600 (2) TIMES, which is related to Divisors-of, is now very interesting
02700 (3) Focus of attention: AM recently defined Divisors-of.
02800
02900 26 examples found, in 9.2 seconds. e.g., Divisors-of(6)={1 2 3 6}.
03000
03100
03200 ** @u[Job 66:] ** Consider numbers having small sets of Divisors-of.
03300
03400 2 Reasons: (1) Worthwhile to look for extreme cases.
03500 (2) Focus of attention: AM recently worked on Divisors-of.
03600
03700 Filling in examples of numbers with 0 divisors.
03800 0 examples found, in 4.0 seconds.
03900 [%%144]
04000 Conjecture: no numbers have precisely 0 divisors.
04100
04200 Filling in examples of numbers with 1 divisors.
04300 1 examples found, in 4.0 seconds. e.g., Divisors-of(1) = {1}.
04400 [%%173]
04500 Conjecture: 1 is the only number with precisely 1 divisor.
04600
04700 Filling in examples of numbers with 2 divisors.
04800 24 examples found, in 4.0 seconds. e.g., Divisors-of(13) = {1 13}.
04900 No obvious conjecture. May merit more study.
05000 Creating a new concept: "Numbers-with-2-divisors".
05100
05200 Filling in examples of numbers with 3 divisors.
05300 11 examples found, in 4.0 seconds. e.g., Divisors-of(49) = {1 7 49}.
00400 All numbers with 3 divisors are also Squares. Definitely merits more study.
00500 Creating a new concept: "Numbers-with-3-divisors".
00600
00700
00800 ** @u[Job 67:] ** Consider the square-roots of Numbers-with-3-divisors.
00900
01000 2 Reasons: (1) Numbers-with-3-divisors are unexpectedly also Perfect Squares
01100 (2) Focus of attention: AM recently worked on Nos-with-3-divisors.
01200
01300 All square-roots of Numbers-with-3-divisors seem to be Numbers-with-2-divisors
01400 e.g., Divisors-of(Square-root(169)) = Divisors-of(13) = {1 13}.
01500 Even the converse of this seems empirically to be true.
01600 i.e., the square of each No-with-2-divisors seems to be a No-with-3-divs
01700 The chance of coincidence is below acceptable limits.
01800 Boosting the interestingness rating of each of the concepts involved.
01900
02000
02100 ** @u[Job 68:] ** Consider the squares of Numbers-with-3-divisors.
02200
02300 3 Reasons: (1) Squares of Numbers-with-2-divisors were interesting.
02400 (2) Square-roots of Numbers-with-3-divisors were interesting.
02500 (3) Focus of attention: AM recently worked on Nos-with-3-divisors.
02600
02700
02800
02900
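The control cycle that produced the trace above -- pop the best job, gather candidate rules via the Genl/Spec lattice, fire the triggered ones most-specific-first -- can be sketched as follows. This is an illustrative Python rendering with invented names and ratings, not AM's Interlisp:

```python
import heapq

class Rule:
    """A heuristic rule: a specificity rating, a test (lhs) and an action (rhs)."""
    def __init__(self, name, specificity, lhs, rhs):
        self.name, self.specificity = name, specificity
        self.lhs, self.rhs = lhs, rhs

def run_one_job(agenda, rules_for, state):
    """One pass of the cycle: pop the top-rated job, collect potentially
    relevant rules, and fire those whose lhs triggers, most specific first."""
    neg_rating, job = heapq.heappop(agenda)        # agenda is a min-heap on -rating
    fired = []
    for rule in sorted(rules_for(job), key=lambda r: -r.specificity):
        if rule.lhs(job, state):                   # does the lhs actually trigger?
            rule.rhs(job, state)                   # execute; may add jobs/concepts
            fired.append(rule.name)
    return job, fired

# Illustrative setup: two rules relevant to a "fill in examples" job.
state = {"examples": []}
specific = Rule("use-known-algorithm", 9,
                lambda j, s: j.startswith("fill-in"),
                lambda j, s: s["examples"].append("from-specific-rule"))
general  = Rule("instantiate-definition", 2,
                lambda j, s: True,
                lambda j, s: s["examples"].append("from-general-rule"))
agenda = [(-300, "fill-in-examples-of-Set-equality"), (-100, "check-examples-of-Sets")]
heapq.heapify(agenda)
job, fired = run_one_job(agenda, lambda j: [specific, general], state)
```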
03000 Now that we've seen how AM works, and we've been exposed to a few "local"
03100
03200 results, let's take a moment to discuss the totality of the mathematics which AM
03300
03400 carried out. AM began its investigations with scanty knowledge of a hundred
03500
03600 elementary concepts of finite set theory. Most of the obvious set-theoretic
03700
03800 concepts and relationships were quickly found (e.g., de Morgan's laws;
03900
04000 singletons), but no sophisticated set theory was ever done (e.g.,
04100
04200 diagonalization). Rather, AM discovered natural numbers and went off exploring
04300
04400 elementary number theory. Arithmetic operations were soon found (as
04500
04700 analogs[%%15] to set-theoretic operations), and AM made surprising progress in
04800
04900 divisibility theory. Prime pairs, Diophantine equations, the unique
05000
05100 factorization of numbers into primes, Goldbach's conjecture -- these were some
00400 of the nice discoveries by AM. Many concepts which we know to be crucial were
00500
00700 never uncovered, however: remainder@r[1], gcd, greater-than, infinity, proof,
00800
00900 etc.
01000
01100
01200 All the discoveries mentioned were made in a run lasting one cpu hour
01300
01400 (Interlisp+100k, SUMEX PDP-10 KI). Two hundred jobs in toto were selected from
01500
01600 the agenda and executed. On the average, a job was granted 30 cpu seconds, but
01700
01800 actually used only 18 seconds. For a typical job, about 35 rules were located
01900
02000 as potentially relevant, and about a dozen actually fired. AM began with 115
02100
02200 concepts and ended up with three times that many. Of the synthesized concepts,
02300
02400 half were technically termed "losers" (both by the author and by AM), and half
02500
02600 the remaining ones were of only marginal interest.
02700
02800
02900 Although AM fared well according to several different measures of performance
03000
03100 (see Section 7.1 in [13]), of greatest significance are its @i[limitations].
03200
03300 This subsection will merely report them, and the next section will analyze
03400
03500 whether they were caused by radical departures from the neo-classical
03600
03700 production-system architecture, or by not departing far enough from that early
03800
03900 design.
04000
04100
04200 As AM ran longer and longer, the concepts it defined were further and further
04300
04400 from the primitives it began with. Thus "prime-pairs" were defined using
04500
04600 "primes" and "addition", the former of which was defined from "divisors-of",
04700
04800
04900 ---------------
05000 1
05100 This concept, and many of the other "omissions", @u[could] have been
05200 discovered by the existing heuristic rules in AM. The paths which would have
05300 resulted in their definition were simply never rated high enough to explore.
00400 which in turn came from "multiplication", which arose from "addition", which was
00500
00600 defined as a restriction of "union", which (finally!) was a primitive concept
00700
00800 (with heuristics) that we had supplied to AM initially. When AM subsequently
00900
01000 needed help with prime pairs, it was forced to rely on rules of thumb supplied
01100
01200 originally about @i[union]ing. Although the heritability property of heuristics
01300
01400 did ensure that those rules were still valid, the trouble was that they were too
01500
01600 general, too weak to deal effectively with the specialized notions of primes and
01700
01800 arithmetic. For instance, one general rule indicated that A union B would be
01900
02000 interesting if it possessed properties absent both from A and from B. This
02100
02200 translated into the prime-pair case as "@i[If p+q=r, and p,q,r are primes, Then
02300
02400 r is interesting if it has properties not possessed by p or by q.]" The search
02500
02600 for categories of such interesting primes @i[r] was of course barren. It showed
02700
02800 a fundamental lack of understanding about numbers, addition, odd/even-ness, and
02900
03000 primes.
03100
03200
03300 As the derived concepts moved further away from finite set theory, the efficacy
03400
03500 of the initial heuristics decreased. AM began to "thrash", appearing to lose
03600
03700 most of its heuristic guidance. It worked on concepts like "prime triples",
03800
03900 which it was not rational to investigate. The key deficiency was the lack
04000
04100 of adequate @i[meta]-rules[6]: heuristics which cause the creation and
04200
04300 modification of new heuristics.
04400
04500
04600 Aside from the preceding major limitation, most of the other problems pertain to
04700
04800 missing knowledge. Many concepts one might consider basic to discovery in math
04900
05000 are absent from AM; analogies were under-utilized; physical intuition was
05100
05200 absent; the interface to the user was far from ideal; etc.
00200
00300
00400 4. Reexamining the Design
00500
00600
00700 Let us now consider the major components of a RS's design and how AM treated
00800
00900 them: the DS, the rules, the distribution of knowledge between DS and rules, and
01000
01100 the rule interpretation policy. For each component, AM's architecture failed to
01200
01300 adhere strictly to the pure RS guidelines. Were these departures worth the loss
01400
01500 of simplicity? Were the deviations due to the task domain (scientific
01600
01700 discovery), to the task view (heuristically guided growth of structured
01800
01900 theories), or to other sources? These are the kinds of questions we shall
02000
02100 address in each of the following subsections.
02200
02300
02400 4.1. Data Structures
02500
02600
02700 We recognize that a single uniform DS (e.g., an infinite STM [19]) is universal
02800
02900 in the Turing sense of being @i[formally] adequate: One can encode any
03000
03100 representation in a linear, homogeneous DS. The completeness of such a DS
03200
03300 design notwithstanding, we believe that encouraging several distinct, special-
03400
03500 purpose DSs will enhance the performance of a discovery system. That is, we are
03600
03700 [%%100]
03800 willing to sacrifice aesthetic purity of DSs for clarity, efficiency, and
03900
04000 power. In this section we will explore this tradeoff.
04100
04200
04300 The data structures used in AM are unlike the uniform memories suggested by the
04400
04500 first design constraint (see Figure 1). One DS -- the agenda -- holds an ordered
04600
04700 list of plausible questions for the system to concentrate on, a list of jobs to
04800
04900 work on. Another DS is the graph of concepts AM knows about. Each concept
05000
05100 itself consists of much structured information (see Figure 2). The reasons AM
00400 has for each job have information associated with them. Still other information
00500
00600 is present as values of certain functions and global variables: the cpu clock,
00700
00800 the total number of concepts, the last thing typed out to the user, the last few
00900
01000 concepts worked on, etc. All these types of information are accessed by the
01100
01200 lhs's (left hand sides) of heuristic rules, and affected by rhs's (some
01300
01400 "deliberately" in the text of the rule, some "incidentally" through a chain of
01500
01600 if-added methods).
01700
01800
01900 Why is there this multitude of diverse DSs? Each type of knowledge (jobs, math
02000
02100 knowledge, system status) needs to be treated quite differently. Since the
02200
02300 primitive operations will vary with the type of information, so should the DS.
02400
02500 For @u[jobs], the primitive kinds of accesses will be: picking the highest-rated
02600
02700 job, deleting the lowest-rated one, reordering some jobs, merging new ones. A
02800
02900 natural choice to make these operations efficient is to keep the system's goals
03000
03100 in a queue ordered by their rating or partially-ordered by those ratings that
03200
03300 are commensurable. For @u[resource information], the usual request is for some
03400
03500 @i[statistic] of some class of primary data. To maintain a table of such
03600
03700 summary facts (like how much the CPU clock has run so far, or how many concepts
03800
03900 there are) is to introduce an unnecessary DS and incur exorbitant costs to
04000
04100 maintain many @i[short-lived] entries that will, most probably, never be used.
04200
04300 It is far more reasonable to run a summarizing procedure to develop just that
04400
04500 ephemeral, up-to-date information that you need. For @u[math concepts], we have
04600
04700 a much less volatile situation. We view them as an ever-growing body of highly-
04800
04900 interrelated facts. Knowledge in this form is stable and rarely deleted. When
05000
05100 new knowledge is added, a great many "routine" inferences must be drawn. In a
00400 uniform, linear memory, each would have to be drawn explicitly; in a structured
00500
00600 one (as the Genl/Spec graph structure provides) they may be accomplished through
00700
00800 the tacit (analogical) characteristics of the representation, simply by deciding
00900
01000 @i[where] to place the information.
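A minimal sketch of such a rating-ordered agenda, with invented job names and ratings (the real agenda's rating scheme was more elaborate):

```python
import heapq

class Agenda:
    """Jobs kept in a heap ordered by rating, so the primitive operations
    named in the text -- pick the highest-rated job, merge a duplicate
    proposal -- stay cheap."""
    def __init__(self):
        self._jobs = []                      # heap entries: [-rating, job, reasons]
    def propose(self, job, rating, reason):
        for entry in self._jobs:
            if entry[1] == job:              # job already proposed: merge it,
                entry[2].append(reason)      # gaining a reason and a rating boost
                entry[0] -= rating // 2
                heapq.heapify(self._jobs)
                return
        heapq.heappush(self._jobs, [-rating, job, [reason]])
    def best(self):
        """Pop and return the top-rated job."""
        return heapq.heappop(self._jobs)[1]

agenda = Agenda()
agenda.propose("fill-in-examples-of-Primes", 300, "no known examples yet")
agenda.propose("check-examples-of-Sets", 100, "routine bookkeeping")
agenda.propose("fill-in-examples-of-Primes", 300, "TIMES is now very interesting")
```

Resource information, by contrast, would not live in any such structure: a summarizing procedure computes statistics like the concept count on demand.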
01100
01200
01300 Each kind of knowledge dictates a set of appropriate kinds of primitive
01400
01500 operations to be performed on it, which in turn suggest natural data structures
01600
01700 in which to realize it. The generality of this perspective on rule-based systems
01800
01900 is made more plausible by examining other RSs which deal with many types of
02000
02100 knowledge (e.g., [5]). If this is so, if the design proceeds from "knowledge to
02200
02300 be represented" to "a data structure to hold it", then fixing @i[a priori] the
02400
02500 capabilities of the DS access primitives available to rules is suspect.
02600
02700
02800 Therefore, we advocate the opposite: the RS designer is encouraged to name every
02900
03000 combination of "machine" operations that together comprise a single conceptual
03100
03200 access of data by rules. In AM, it is quite reasonable to expect that a request
03300
03400 like "find all generalizations of a given concept" would be such a primitive
03500
03600 (i.e., could be referred to by name). Even though it might cause the "machine"
03700
03800 (in this case, LISP) to run around the Genl/Spec graph, a single rule can treat
03900
04000 this as merely an "access" operation. The use of complex tests and actions is
04100
04200 not new; we simply claim that it is @i[always] preferable to package knowledge
04300
04400 (for which a reasonably fast algorithm is available) as a single action (though
04500
04600 it may have side-effects in the space of concepts) or a single test (so long as
04700
04800 its sole side-effect -- modulo caches -- is to signal). Primitive tests and
04900
05000 actions should be maximally algorithmic, not minimally computational.
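For instance, a named compound access of this kind might look as follows; the Genl links shown are illustrative stand-ins for AM's actual lattice:

```python
# A minimal Genl/Spec graph: each concept names its immediate generalizations.
genl = {
    "Set-equality": ["Object-equality"],
    "Object-equality": ["Any-predicate"],
    "Any-predicate": ["Any-concept"],
    "Any-concept": [],
}

def generalizations(concept, graph=genl):
    """A single named "access" primitive: although it walks the Genl links
    transitively, a rule's lhs can treat it as one conceptual operation."""
    seen, frontier = [], list(graph.get(concept, ()))
    while frontier:
        g = frontier.pop(0)
        if g not in seen:
            seen.append(g)
            frontier.extend(graph.get(g, ()))
    return seen
```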
00200
00300
00400 The neo-classical view of designing a production rule system was that of
00500
00600 defining a machine. Our present view is that RSs do not @i[compute] so much as
00700
00800 they @i[guide attention]. In adopting this view (thereby separating the
00900
01000 controller from the effector), we recognize that we are giving up an attractive
01100
01200 feature of pure rule systems: a homogeneous basis for definition. For example,
01300
01400 the rule system designer must now spell out in detail the definitions of the DS
01500
01600 accessing functions; but the designer of a neo-classical RS is simply able to
01700
01800 take as @i[givens] the matching and inserting operations (as specified in neo-
01900
02000 classical principle #6, Figure 1), and he builds each more complicated one out
02100
02300 of these primitives@r[2]. In giving up the old view of the RS as an abstract
02400
02500 computing machine, the RS designer must use another homogeneous substrate (e.g.,
02600
02700 LISP) in terms of which to define his DSs and especially the procedures that
02800
02900 process them. In exchange, he obtains a clear distinction between two kinds of
03000
03100 knowledge contained in the neo-classical rule: plausible proposals for what to
03200
03300 do next, and how to accomplish what might be proposed.
03400
03500
03600 We have seen that admitting complicated and varied DSs leads to stylized sets of
03700
03800 DS accesses. The DSs and their sets of read/write primitives must in turn be
03900
04000 explicitly defined (coded) by the designer. This seems like a high price to pay.
04100
04200 Is there any bright side to this? Yes, one rather interesting possibility is
04300
04400 opened up. Not only the RS designer, but the RS @i[itself] may define DSs and DS
04500
04600 access functions. In AM, this might take the form of dynamically defining new
04700
04800
04900 ---------------
05000 2
05100 Either by stringing out a sequence of primitives on one side of a rule, or by
05200 handcrafting a tightly coupled bundle of rules (so firing such a rule would
05300 simulate traversing one link of the kind that abound in AM's DSs).
00400 kinds of facets (slots). E.g., after "injective Function" is defined, and after
00500
00600 some properties of it have been discovered, it would be appropriate to introduce
00700
00800 a new facet called "inverse" for each (concept representing an) injective
00900
01000 function. In AM, the actual definitions of the facets of every concept are
01100
01200 complex enough (shared structure), inter-related enough (shared meaning), and
01300
01400 interesting enough (consistent heuristic worth) that a special concept was
01500
01600 included for each one (e.g., a concept called "Examples") which contained a
01700
01800 definition, description,... of the facet. Thus the same techniques for
01900
02000 manipulating and discovering math concepts may be applied to DS design concepts.
02100
02200 Not only do math theories emerge; so can new DS access functions (new slots;
02300
02400 e.g., "Small Boundary Examples", "Factorization", or "Inverse").
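A toy rendering of such run-time slot definition, with all facet names beyond those mentioned in the text invented for illustration:

```python
class Concept(dict):
    """A concept is just a bundle of named facets."""

# The legal facet kinds are themselves described, so they can grow at run time.
facet_kinds = {"Definition": "a predicate", "Examples": "a list",
               "Algorithms": "executable code"}

def define_facet(name, description):
    """Introduce a brand-new kind of slot at run time -- e.g. "Inverse"
    once injective functions have been discovered."""
    facet_kinds[name] = description

def set_facet(concept, facet, value):
    if facet not in facet_kinds:
        raise KeyError("unknown facet: " + facet)
    concept[facet] = value

doubling = Concept(Definition="n -> 2n")
define_facet("Inverse", "inverse of an injective function")
set_facet(doubling, "Inverse", "n -> n/2")
```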
02500
02600
02700 It should be noted that in opting for non-uniform DSs, we have not in general
02800
02900 [%%100]
03000 sacrificed efficiency. One has only to compare the time to access a node
03100
03200 in a tree, versus in a linear list, to appreciate that efficiency may, in fact,
03300
03400 be @i[increased] by non-uniformity.
03500
03600
03700 Just how tangled up a DS should we tolerate? Should memory elements be
03800
03900 permitted to refer to (to "know about") each other? We believe the answer
04000
04100 depends upon the @i[type] of data structure involved. For the homogeneous DS
04200
04300 called for in the neo-classical design, much simplicity is preserved by
04400
04500 forbidding this kind of interrelationship. But consider a DS like AM's graph of
04600
04700 concepts. It is growing, analogically interrelated, and it contains descriptions
04800
04900 of its elements. This richness (and sheer quantity) of information can be coded
05000
05100 only inefficiently in a uniform, non-self-referential manner. For another
00400 example, consider AM's agenda of jobs. One reason for a job may simply be the
00500
00600 existence of some other job. In such a case, it seems natural for part of one
00700
00800 entry on the agenda (a reason part of one job) to point to another entry in the
00900
01000 same DS (point to another specific job on the agenda). Thus, inter-element
01100
01200 pointers @i[are] allowed, even though they blur a "pure" distinction between a
01300
01500 DS and its entries@r[3]. Inter-element references play a necessary role in
01600
01700 organizing large bodies of highly interrelated information into structured
01800
01900 modules.
02000
02100
02200 There is yet another motivation for special-purpose DSs when the task of the RS
02300
02400 includes sensing an external environment. Using a uniform memory, external
02500
02600 stimuli are dumped into the working memory and rub shoulders with all the other
02700
02800 data. They must then be distinguished from the others. ("Must" because to
02900
03000 freely intermingle what one sees or is told with what one thinks or remembers is
03100
03200 to give way to endless confusion.) How much cleaner, less distracting, and safer
03300
03400 it is for stimuli to arrive in their own special place -- a place which might
03500
03600 well be a special purpose store such as an intensity @i[array] (not even a list
03700
03800 structure at all), or a low-level speech-segment @i[queue]. A linear memory
03900
04000 (e.g., an infinite STM) is of course adequate; one could tag each incoming
04100
04200 environmental stimulus with a special flag. But the design philosophy we are
04300
04400 proposing is aimed at maximizing clarity and efficiency, not uniformity or
04500
04600 universality.
04700
04800
04900
05000
05100 ---------------
05200 3
05300 In section 4.3 we will mention work that blurs this distinction even further.
00200
00300
00400 We know that this view of DSs means making a specialized design effort for each
00500
00600 class of knowledge incorporated into the RS. But that is desirable, as it buys
00700
00800 us three things: @i[(i)] system performance is increased, @i[(ii)] some forms of
00900
01000 automatic learning are facilitated, @i[(iii)] knowledge is easier to encode.
01100
01200
01300 [%%225]
01400 4.2. Rules
01500
01600
01700 In the "pure" view of RSs, the rule store is not a full-fledged DS of the RS.
01800
01900 For example, in Waterman's [24] poker player, rules may not be deleted.
02000
02100 Rychener [22] states that the only way his RS may inspect rules is by examining
02200
02300 the effect of those rules which have recently fired. Although AM had no
02400
02500 explicit taboo against inspecting rules, such analyses were in practice never
02600
02700 possible, since the rules were @i[ad hoc] blocks of LISP code. This eventually
02800
02900 turned out to be the main limitation of the design of AM. The ultimate
03000
03100 impediment to further discovery was the lack of rules which could reason about,
03200
03300 modify, delete, and synthesize other rules. AM direly needed to synthesize
03400
03500 specialized forms of the given general heuristic rules (as new concepts arose;
03600
03700 see the end of Section 3.5).
03800
03900
04000 We want our heuristic rules to be added, kept track of, reasoned about,
04100
04200 modified, deleted, generalized, specialized, ... whenever there is a good
04300
04400 reason to do so. Note that those situations may be very different from the ones
04500
04600 in which such a rule might fire. E.g., upon discovering a new, interesting
04700
04800 concept, AM should try to create some specially-tailored heuristic rules for it.
04900
05000 They wouldn't actually @i[fire] until much later, when their lhs's were
00400 triggered. After having constructed such rules, AM might subject them to
00500
00600 criticism and improvement as it explores the new concept.
00700
00800
00900 In sum, we have found that the discovery of heuristic rules for using new math
01000
01100 concepts is a necessary part of the growth of math knowledge. Hence, following
01200
01300 the argument in 4.1, the rules themselves should be DSs, and each rule might be
01400
01500 described by a concept with effective (executable) and non-effective (purely
01600
01700 descriptive) facets. This lesson was made all the more painful because it was
01800
01900 not new [5]. Apparently the need for reasoning about rules is common to many
02000
02100 tasks.
02200
02300
02400 The current re-coding of AM does in fact have each rule represented as a
02500
02600 concept. What kinds of non-effective "facets" do they have? Recall that one of
02700
02800 the features of the original AM (as described in Section 3.3) was that with each
02900
03000 rule were associated some symbolic @i[reasons] which it could provide whenever
03100
03200 it proposed a new job for the agenda. So one kind of facet which every rule can
03300
03400 possess is "Reasons". What others are there? Some of them @i[describe] the rule
03500
03600 (e.g., its average cost); some facets provide a road map to the space of rules
03700
03800 (e.g., which rule schemata are mere specializations of the given one); some
03900
04000 facets record its derivation (e.g., the rule was proposed as an analog to rule X
04100
04200 because ...), its redundancy (some other rules need not be tried if this one
04300
04400 is), etc.
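In such a representation, a rule-as-concept might look roughly like this; an invented example, with the facet names following the ones listed above:

```python
rule_concept = {
    # Effective (executable) facets:
    "Lhs": lambda job, state: job.endswith("Set-equality"),
    "Rhs": lambda job, state: state.setdefault("proposed", []).append("generalize it"),
    # Non-effective (descriptive) facets, open to inspection by other rules:
    "Reasons": ["No known examples of this concept so far"],
    "AverageCost": 18,                    # in cpu seconds, say
    "Specializations": [],                # a road map into the space of rules
    "Derivation": "proposed as an analog to a rule about Set-union",
}

def average_cost(rule):
    """Other rules can reason about this one simply by reading its facets."""
    return rule["AverageCost"]
```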
04500
04600
04700 There are some far-reaching consequences of the need to reason about rules just
04800
04900 as if they were any other concepts known to AM. When one piece of knowledge
00400 [%%245]
00500 relates to several rules, then one general concept, a rule @i[schema],
00600
00700 should exist to hold that common knowledge. Since each rule is a concept, there
00800
00900 will be a natural urge to exploit the same Genl/Spec organization that proved so
01000
01100 useful before. Heritability still holds; e.g., any reason which explains rule R
01200
01300 is also somehow a partial explanation of each specialization of R.
01400
01500
01600 Rule schemata have cause to exist simply because they generalize -- and hold
01700
01800 much information which would otherwise have to be duplicated in -- several
01900
02000 specific rules. They may tend to be "big" and less directly productive when
02100
02200 executed, yet they are of value in capturing the essence of the discovery
02300
02500 techniques@r[4]. We put "big" in quotes because sheer length (total number of
02600
02700 lhs tests allowed, total number of rhs actions) is not directly what we're
02800
02900 talking about here. A general rule schema will capture many regularities, will
03000
03100 express an idea common to several more specific rules. It will contain dual
03200
03300 forms of the same rule, sophisticated types of variable-binding (for the
03400
03500 duration of the rule application), and searching may even be required to find
03600
03700 the actions of such a general rule. We may even wish to consider every rule in
03800
03900 the RS as a rule schema of some level of generality, and much processing may go
04000
04100 on to find the particular instance(s) of it which should be applied in any
04200
04300 particular situation.
04400
04500
04600 Let us consider a rule schema called the "rule of enthusiasm". It
04700
04800
04900 ---------------
05000 4
05100 In AM, even the specific rules may be "big" in the sense that their very
05200 precise knowledge may involve much testing to trigger and, once triggered, may
05300 conclude some elaborate results.
00400 [%%63]
00500 subsumes several rules in the original AM system (pp. 247-8 of [13]),
00600
00700 e.g., those that said:
00800
00900 If concept G is now very interesting, and G was created as a
01000 generalization of some earlier concept C,
01100 Give extra consideration to generalizing G, and to generalizing C in
01200 other ways.
01300
01400
01500 and:
01600
01700 If concept S proved to be a dead-end, and S was created as a
01800 specialization of some earlier concept C,
01900 Give less consideration to specializing S, and to specializing C in
02000 other ways in the future.
02100
02200
02300 The proposed rule schema is:
02400
02500 If concept X has very @u[high/low] interest and X can be derived from
02600 some concept C by means @u[m],
02700 Give @u[more/less] consideration to finding (and elaborating) concepts
02800 derived from C, X (and their "neighbors") by @u[means analogous to m][%%15].
03100
03200
03300 [%%268] [%%39]
03400 There are four variables to be matched and coordinated in the lhs
03500
03600 of this rule: a concept @u[X], the direction (high or low) of its extreme
03700
03800 interest rating, a derivation procedure @u[m] and an associated source concept
03900
04000 @u[C]. The action itself is to search for jobs of a certain type and give them
04100
04200 a corresponding (high or low) rating change. Three types of matching are
04300
04400 present: @i[(i)] ranging over a set of alternatives which are known at the time
04500
04600 the rule is written (e.g., the "high/low" alternative); @i[(ii)] ranging over a
04700
04800 set of alternatives which can be accessed easily at any moment the rule is run,
04900
05000 like the set of concepts and connections between them now in existence (e.g.,
00400 the variables X and C range over this kind of set); @i[(iii)] ranging over a set
00500
00600 of alternatives which must be heuristically searched for as part of the rule
00700
00800 execution (e.g., "analogous" and "neighbors" only make sense after a nontrivial
00900
01000 amount of searching has been performed).
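One way to render the schema's skeleton, with the analogy search (type @i[(iii)] above) collapsed into a lookup over invented derivation records:

```python
def rule_of_enthusiasm(concept, interest, derivations):
    """Schema: if concept X turned out very interesting (or a dead end),
    give more (or less) consideration to deriving further concepts from
    X's source C by means analogous to m.  The heuristic search for
    analogues in the real schema is reduced here to a table lookup."""
    if interest not in ("high", "low"):
        return []
    direction = "more" if interest == "high" else "less"
    return [(direction, source, "means analogous to " + means)
            for source, means in derivations.get(concept, [])]

# Derivation records: say Primes arose from Divisors-of by restriction to extremes.
derivations = {"Primes": [("Divisors-of", "restrict-to-extreme-cases")]}
actions = rule_of_enthusiasm("Primes", "high", derivations)
```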
01100
01200
01300 Since the "rule of enthusiasm" is very general, it will only be tried if no more
01400
01500 specific rules (such as the two which were listed just above it) are relevant at
01600
01700 the time. Ideally, the search to specify the action should create a new,
01800
01900 specialized form of the rule of enthusiasm to catch this situation and handle it
02000
02100 quickly, should it arise again. Note that versions of this schema that mention
02200
02300 generalization or specialization are also schemata (without any specification
02400
02500 search); they are simply less general schemata than the rule of enthusiasm
02600
02700 itself. Whenever a new subject for discovery gets defined, the abstract, hard-
02800
03000 to-execute rule schemata can be specialized (compiled, refined[%%240], etc.)
03100
03200 into efficient heuristics for that subject.
03300
03400
03500 Another use of a rule schema might be to @i[name] a collection of neo-classical
03600
03700 rules that are coupled by jointly fulfilling a single function. Consider a
03800
03900 collection of rules which is tightly coupled, say to perform an iteration. Much
04000
04100 knowledge about the iteration loop as a whole may exist. Where is such
04200
04300 descriptive information to be stored and sought? Either it must be duplicated
04400
04500 for each of the coupled rules, or there must be a rule-like concept which "knows
04600
04700 about" the iteration as one coherent unit. We conclude that even if some
04800
04900 intertwined rules @i[are] kept separate, an extra rule (a schema) should exist
05000
05100 which (at least implicitly) has a rhs which combines them (by containing
00400 knowledge common to all of them). Thus rule schemata do more than just unify
00500
00600 general properties of rules: there must also be schemata of the kind that relate
00700
00800 function to mechanism.
00900
01000
01100 Another problem crops up when we consider what happens if one of the coupled rules
01200
01300 is modified. Often, some corresponding change should be made in all its
01400
01500 companions. For example, if a term is generalized (replacement of "prime" by
01600
01700 "number" everywhere) then the same substitution had probably better be done in
01800
01900 each rule with which this one is supposed to couple. What we are saying is
02000
02100 that, for RSs which modify their own rules, it can be dangerous to split up a
02200
02300 single conceptual process into a bunch of rules which interact in more or less
02400
02500 fixed ways when run, without continuing to reason about them as a whole,
02600
02700 @i[like any other algorithm] composed of parts. Here again, we find pressure to
02800
02900 treat RSs as algorithms, not vice-versa.
03000
03100
03200 Finally, let us make a few irresistible observations. The whole notion of
03300
03400 coupling via meaningless tokens is aesthetically repugnant and quite contrary to
03500
03600 "pure" production system spirit. By "meaningless" we mean entries in a DS that
03700
03800 provide a narrow hand-crafted channel of communication between two specific
03900
04100 rules that therefore "know about each other"@r[5]. At the least, when a coupled
04200
04300 rule deposits some "intermediate-state" message in a DS, one would like that
04400
04500 message to be meaningful to many rules in the system, to have some significance
04600
04700 itself. We can see that entries in a DS have an expected meaning to the read
04800
04900
05000 ---------------
05100 5
05200 By contrast, a "meaningful" DS entry will embody a piece of information which
05300 is specific to the RS's task, not to the actual rules themselves.
00500 access functions that examine the DS@r[6]. If this purity is maintained, then
00600
00700 any apparent "coupling" would be merely superficial: each rule could stand alone
00800
00900 as a whole domain-dependent heuristic. Thus no harm should come from changing a
01000
01100 single rule, and more rules could be added that act on the "intermediate
01200
01300 message" of the coupling. Such meaningful, dynamic couplings should be
01400
01500 encouraged. Only the meaningless, tight couplings are being criticized here.
01600
01700
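The contrast above can be sketched in modern code. This is purely our own illustration (the data-structure layout, rule names, and entry formats are hypothetical, not AM's): the "meaningless" pair communicates through an opaque token only they understand, while the "meaningful" pair communicates through a task-specific fact that any rule reading the DS could act upon.

```python
# Hypothetical sketch of meaningless vs. meaningful coupling via DS entries.
# All names and formats are our own illustration, not taken from AM.

def rule1_meaningless(ds):
    # Opaque token: a hand-crafted channel only rule2_meaningless "knows about".
    ds.append(("TOKEN-37", "phase-2-ready"))

def rule2_meaningless(ds):
    if ("TOKEN-37", "phase-2-ready") in ds:   # tight, rule-specific coupling
        ds.append(("done", None))

def rule1_meaningful(ds):
    # A task-specific fact: meaningful to every rule that reads the DS.
    ds.append(("example-of", "Primes", 7))

def rule2_meaningful(ds):
    # Fires on ANY example of Primes, regardless of which rule produced it,
    # so the apparent "coupling" is merely superficial.
    for entry in list(ds):
        if entry[0] == "example-of" and entry[1] == "Primes" and entry[2] > 2:
            ds.append(("conjecture", "primes beyond 2 are odd"))

ds = []
rule1_meaningful(ds)
rule2_meaningful(ds)   # ds now holds both the fact and a conjecture built on it
```

Deleting `rule1_meaningful` here would cost the system one source of examples but break nothing; deleting `rule1_meaningless` silently disables its partner.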
4.3. Distribution of Knowledge Between Rules and DS

A common "pure" idea is that all knowledge of substance ought to be represented as rules.[%%165] Independent of such rules, the DS forms no meaningful whole initially, nor has it any final interpretation. The "answer" which the RS computes is not stored in the DS; rather, the answer consists in the process of rule firings.@r[7] The DS is "just" an intermediate vehicle of control information.

Contrary to this, we say that rules ought to have a @i[symbiotic] relationship to DSs. The DSs hold meaningful domain-dependent information, and rules process knowledge represented in them. For RSs designed to perform scientific research, the DSs contain the theory, and the rules contain methods of theory formation.

---------------
6
Perhaps this "meaning" could even be expressed formally as an invariant which the write access functions for the DS must never violate.

7
The sequence of actions in time. In addition, perhaps, the "answer" may involve a few of their side-effects. E.g., (Respond 'YES').
But much domain-dependent knowledge is conditional. E.g., "If n and m are relatively prime and divide x, then so must nm". Shouldn't such If/Then information be encoded as rules? We answer an emphatic @i[No]. Just as there is a distribution of "all knowledge of substance" between rules and DSs, so too must the conditional information be partitioned between them. We shall illustrate two particular issues: (i) Much information can be stored implicitly in DSs; (ii) Some conditional knowledge is inappropriate to store as rules.
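The conditional fact just quoted can be checked concretely (a quick sketch of ours, not anything stored in AM; Python's gcd serves as the test of relative primality):

```python
# If n and m are relatively prime and both divide x, then nm divides x.
from math import gcd

n, m, x = 3, 5, 45
assert gcd(n, m) == 1                # n and m are relatively prime
assert x % n == 0 and x % m == 0     # both divide x
assert x % (n * m) == 0              # hence nm = 15 must divide x as well
```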
When designing a DS, it is possible to provide mechanisms for holding a vast amount of information @i[implicitly]. In AM, e.g., the organization of concepts into a Genl/Spec hierarchy (plus the assumed heritability properties; see 3.4) permits a rule to ask for "all concepts more general than Primes" as if that were a piece of data explicitly stored in a DS. In fact, only direct generalizations are stored ("The immediate generalization of Primes is Numbers"), and a "rippling" mechanism automatically runs up the Genl links to assemble a complete answer. Thus the number of specific answers the DS can provide is far greater than the number of individual items in the DS. True, these DS mechanisms will use up extra time in processing to obtain the answer; this is efficient since any particular request is very unlikely to be made. Just as each rule knows about a general situation, of which it will only see a few instances, that same quality (of wide potential applicability) is just as valuable for knowledge in DSs. These are situations where, like Dijkstra's multiplier [8], the mechanism must provide any of the consequences of its knowledge quickly on demand, but in its lifetime will only be asked a few of them.
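The rippling mechanism described above might be sketched as follows. This is a minimal reconstruction under our own assumptions: the concept names follow the text, but the table layout and function names are hypothetical, not AM's actual code.

```python
# Sketch of implicit storage: only IMMEDIATE generalizations are stored;
# a "rippling" lookup walks the Genl links to assemble the full answer.

GENL = {                      # each concept -> its immediate generalization(s)
    "Primes":  ["Numbers"],
    "Numbers": ["Objects"],
    "Objects": ["Anything"],
}

def all_generalizations(concept):
    """Ripple up the Genl links, building the complete answer on demand."""
    found, frontier = [], list(GENL.get(concept, []))
    while frontier:
        c = frontier.pop()
        if c not in found:
            found.append(c)
            frontier.extend(GENL.get(c, []))
    return found

# A rule may now ask for "all concepts more general than Primes" as if it
# were explicitly stored:
# all_generalizations("Primes")  ->  ['Numbers', 'Objects', 'Anything']
```

Three stored links answer many distinct queries; the extra walk at lookup time is the price of the compression.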
Now that we have seen how tacit information @i[can] be encoded into DSs, let us see some cases where it @i[should] be -- i.e., where it is not appropriate to encode it as rules of the system. Many things get called implication, and only some of them correspond to rule application. For instance, there is logical entailment[%%52] (e.g., if A and B then A), physical causation (e.g., if it rains, then the ground will get wet), and probable association (e.g., if it is wet underfoot, then it has probably been raining). These all describe the way the world is, not the way the perceiver of the world behaves. Contrast them with knowledge of the form "If it is raining, then open the umbrella". We claim that this last kind of situation-action relationship should be encoded as rules for the RS, but that the other types of implication should be stored declaratively within the DS. Let's try to justify this distinction.
The situation-action rules indicate imperatively how to behave in the world; the other types of implication merely indicate expected relationships and tendencies within the world. The rules of a RS are meant to indicate potential procedural actions which are obeyed by the system, while the DSs indicate the way the world (the RS's environment) behaves in terms of some model of it. The essential thing to consider is what relations are to be @i[caused in time]; these are the things we should cast as rules. The lhs of a rule measures some aspect of knowledge presently in DSs, while the rhs of the rule defines the attention of the system (regarded as a processor feeding off of the DS) in the immediate future.

This is the heart of why rule-sets are algorithms. They are algorithms for guiding the application of other (DS processing) algorithms. It also explains why other kinds of implications are unsuitable to be rules. Consider causal implication ("Raining --> Wet"). While the lhs could be a rule's lhs (it measures an aspect of any situation), the rhs should @i[not] be a rule's rhs (it does not indicate an appropriate action for the system to take).@r[8]
Most purist production systems have (often implicitly!) a rule of the form "If the left side of an implication is true in the database, Then assert the right side". This is only one kind of rule, of course, capable of dealing with implications. For example, MYCIN and LT [17] (implicitly) follow a very different rule: "If the rhs of an implication will satisfy my goal, Then the lhs of the implication is now the new goal".[%%30] Other rules are possible; many rules for reasoning may feed off the same "table" of world knowledge. The point is that the implications themselves are declarative knowledge, not rules. In summary, then, it may be very important to distinguish rules (attention guides) from mere implications (access guides), and to store the latter within the DSs. This policy was not motivated by the scientific inference task for our RS. We believe it to be a worthwhile guideline in the design of @i[any] RS.
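The two reasoning rules just contrasted can both be written to feed off the same declarative "table" of implications. The sketch below is our own toy illustration (the facts, table, and function names are hypothetical; MYCIN and LT of course worked quite differently in detail): one rule chains forward from known facts, the other chains backward from a goal, and neither rule @i[is] an implication.

```python
# Implications stored declaratively in the DS; two different rules read them.

IMPLICATIONS = [("Raining", "Wet"), ("Wet", "Slippery")]
FACTS = {"Raining"}

def forward_rule(facts):
    """'If the lhs of an implication is true in the database, assert the rhs.'"""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in IMPLICATIONS:
            if lhs in facts and rhs not in facts:
                facts.add(rhs)
                changed = True
    return facts

def backward_rule(goal):
    """'If the rhs of an implication will satisfy my goal, the lhs of the
    implication is now the new goal' (the MYCIN/LT style named above)."""
    if goal in FACTS:
        return True
    return any(backward_rule(lhs) for lhs, rhs in IMPLICATIONS if rhs == goal)
```

Changing the world knowledge means editing the table; both rules keep working unaltered.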
---------------
8
In a RS that aspires to any generality at all, an antecedent theorem of the form "if [you know that] it is raining, then [assert that] it is wet" is not the appropriate form to store this knowledge; it is too compiled a form, standing alone. If "told" (or given) a rule like this, a learning system should "parse" it as a familiar kind of deduction, file the residue of new information away as a conjectured tendency of wetness to follow rain, and start checking for exceptions. A sophisticated (and lucky) discovery RS might thereby develop the concept of "shelter".
4.4. Interpreter

After a rule fires, the neo-classical interpretation policy[%%159] (#9 in Figure 1) demands that @i[any] rule in the system can potentially be the next one selected to fire. This is true regardless of the speed-up techniques used in any particular implementation (say, by preprocessing the lhs's into a discrimination net [22]). But consider RSs for scientific discovery tasks. Their task -- both at the top level and frequently at lower levels -- is quite open-ended. If twenty rules trigger as relevant to such an open-ended activity (e.g., gathering empirical data,[%%144] inducing conjectures, etc.) then there is much motivation for continuing to execute just these twenty rules for a while. They form an @i[ad hoc] plausible search algorithm for the agenda item selected.
A RS for discovery might reasonably be given a complex interpreter (rule-firing policy). AM, for example, experimented with a two-pass interpreter: first, a best-first, agenda-driven resource allocator and attention focusser selects the job it finds most interesting; second, it locates the set of relevant rules (typically about 30 to 40 rules) for the job, and begins executing them one after another (in best-first order of specificity) until the resources allocated in the first step run out [20]. The overall rating of the job which these rules are to satisfy determines the amount of cpu time and list cells that may be used up before the rules are interrupted and the job is abandoned.

For example, say the job were "Find examples of Primes". It's allotted 35 cpu seconds and 300 list cells, due to its overall priority rating just before it was plucked from the agenda. Say 24 rules are relevant. The first one quickly finds that "2" and "3" are primes. Should the job halt right then? No, not if the real reason for this job is to gather as much data as possible, data from which conjectures will be suggested and tested. In that case, many of the other 23 rules should be fired as well. They will produce not only @i[additional] examples, but perhaps other @i[types] of examples.
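The two passes described above can be sketched in miniature. The 35-second budget and the "Find examples of Primes" job come from the text; everything else (the agenda encoding, the rule records, the cost accounting) is our own hypothetical scaffolding, not AM's implementation.

```python
# Toy sketch of AM's two-pass interpreter.
import heapq

def run(agenda, relevant_rules):
    # Pass 1: best-first selection of the most interesting job.
    rating, job = heapq.heappop(agenda)   # agenda holds (-rating, job) pairs
    budget = -rating                      # resources scale with the rating

    # Pass 2: fire the relevant rules in best-first order of specificity
    # until the budget allocated in pass 1 runs out.
    results = []
    for rule in sorted(relevant_rules[job], key=lambda r: -r["specificity"]):
        if budget <= 0:
            break                         # resources exhausted: job abandoned
        budget -= rule["cost"]
        results.extend(rule["action"]())  # keep firing: later rules may yield
                                          # other TYPES of examples
    return job, results

agenda = [(-35, "Find examples of Primes")]
heapq.heapify(agenda)
rules = {"Find examples of Primes": [
    {"specificity": 2, "cost": 5,  "action": lambda: [2, 3]},
    {"specificity": 1, "cost": 10, "action": lambda: [5, 7, 11]},
]}
# run(agenda, rules) -> ("Find examples of Primes", [2, 3, 5, 7, 11])
```

Note that the first rule alone would have sufficed to "satisfy" the job; the remaining budget is what lets the less specific rule contribute further examples.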
The jobs on AM's agenda are really just mini-research questions which are plausible to spend time investigating. Although phrased as specific requests, each one is really a research proposal, a topic to concentrate upon. We found it necessary to deviate from the simplest uniform interpreter for clarity (e.g., a human can follow the first-pass (job selection) taken alone and can follow the second-pass (job execution) by itself), for efficiency (knowing that all 24 rules are relevant, there is no need to find them 35 times), and for power (applying qualitatively different kinds of rules yields various types of examples). We claim this quality of open-endedness will recur in any RS whose task is free concept exploration. This includes all scientific discovery but not all scientific inference.[%%150]
5. Speculations for a New Discovery System

The spirit of this paper has been to give up straightforward simplicity in RSs for clarity, efficiency, and power. Several examples have been cited, but we speculate that there are further tradeoffs of this kind which are applicable to RSs whose purpose is to make new discoveries.
Often, there are several possible ways the designer may view the task of (and subtasks of) the intended RS. Say we wish to add the notion of "proof" to AM. Should we represent proof as a resolution search, as a process of criticism and improvement [11] spiralling toward a solution, as a natural deduction cascade, ...? Although any one of these task-views might perform respectably, we advocate the incorporation of all of them, despite the concomitant costs of added processing time, space, and interfacing. In fact, we wish never to exclude the possibility of the system acquiring another task-view.

We look for the development of further discovery tools in the form of domain-independent meta-heuristics that synthesize heuristic rules, and in the form of abstract heuristic schemata[%%1] that specialize into efficient rules for each newly-discovered domain. These discovery tools are all part of "getting familiar" with shallowly understood concepts, such as synthesized ones tend to be initially. It may even be that symbolic analogy techniques exist, cutting across the traditional boundaries of knowledge domains.
We contemplate a system that keeps track of (and has methods with which it attempts to improve) the design of its own DSs, its own control structure, and perhaps even its own design constraints. Although working in (a collection of) specific domains, this would be a general @i[symbol system discoverer], capable of picking up and exploring formulations, testing them and improving them.

5.1. A New Set of Design Constraints
Below are 13 principles for designing a RS whose task is that of scientific theory formation.[%%150] They are the result of reconsidering the original principles (Figure 1) in the light shed by work on AM. Most of the "pure" principles we mentioned in Figure 1 are changed, and a few new ones have emerged.

FIGURE 3: Scientific Discovery RS Architecture[%%23]
@r[1. ]Principle of Several Appropriate Memories. For each type of knowledge which must be dealt with in its own way, a separate DS should be maintained. The precise nature of each DS should be chosen so as to facilitate the access (read/write) operations which will be most commonly requested of it.

@r[2. ]Principle of Maximal DS Accesses. The set of primitive DS access operations (i.e., the read tests which a rule's lhs may perform, and the write actions which a rhs may call for) is chosen to include the largest packages (clusters, chunks,...) of activity which are commonly needed and which can be performed efficiently on the DS.

@r[3. ]Principle of Facetted DS Elements. For ever-growing data structures, there is much to be gained and little lost by permitting parts of one DS item to point to other DS items. In particular, schematic techniques of representing content by structure are now possible.

@r[4. ]Principle of Rules as Data. The view which the RS designer takes of the system's task may require that some rules be capable of reasoning about the rules in the RS (adding new ones, deleting old ones, keeping track of rules' performance, modifying existing rules,...). Some of the methods the RS uses to deal with scientific knowledge may be applicable to dealing with rules as well. In such cases, the system's rules may thus be naturally represented as new entries in the existing DS which holds the scientific theory.

@r[5. ]Principle of Regularities Among Rules. Each rule is actually a rule @u[schema].[%%64] Sophisticated processing may be needed both to determine which instance(s) are relevant and to find the precise sequence of actions to be executed. Such schemata are often quite elaborate.
@r[6. ]Principle of Avoiding Meaninglessly-Coupled Rules. Passing special-purpose loop control notes back and forth is contrary both to the spirit of pure RSs and to efficiency. If rules are to behave as coupled, the least we demand is that the notes they write and read for each other be meaningful entries in DS (any other rule may interpret the same note, and other rules might have written one identical to it).

@r[7. ]Principle of Controlled Environment. For many tasks, it is detrimental to permit external stimuli (from an environment) to enter any DS at random. At the least, the RS should be able to distinguish these alien inputs from internally-generated DS entries.

@r[8. ]Principle of Tacit Knowledge. In designing the DS, much knowledge may be stored @u[implicitly]; e.g., by where facts are placed in a hierarchical network. The DS should be designed so as to maximize this kind of concentrated, analogical information storage. Hence, hard-working access functions are needed to encode and decode the full meaning of DSs.

@r[9. ]Principle of Named Algorithms. When basic, "how to" knowledge is available, it should be packaged as an operation and used as a part of the lhs or rhs of various rules. Embodying this chunk of knowledge as several coupled rules is not recommended, for we will want to manipulate and utilize this knowledge as a whole.

@r[10. ]Principle of Rules as Attention Guides. Knowledge should be encoded as rules when it is intended to serve as a guide of the system's attention, to direct its behavior. Other kinds of information, even if stated in conditional form, should be relegated to DSs (either explicitly as entries, or implicitly as special access functions).

@r[11. ]Principle of Inertial Interpreter.[%%77] In tasks like scientific research, where relevant rules will be performing inherently open-ended activities (e.g., data-gathering), such rules should be allowed to continue for a while even after they have nominally carried out the activity (e.g., gathered one piece of data). In such cases, the occasional wasted time and space is more than compensated for by the frequent acquisition of valuable knowledge that was concentrated in the later rules. For scientific discovery, no single rule (however "appropriate") should be taken as sufficient: a single rule must necessarily view the task in just one particular way. All views of the task have something to contribute;[%%221] hence variety depends on a policy of always applying several rules.
@r[12. ]Principle of Openness. A discovery rule system can be enriched by incorporating into its design several independent views of the knowledge it handles. Never assume everything is known about a class of knowledge. All appropriate formulations of a knowledge class have something to contribute; hence variety depends on openness to new formulations.

@r[13. ]Principle of Support of Discovery by Design. By representing its own design explicitly (say, as concepts), the RS could study and improve those concepts, thereby improving itself. This includes the DS design,@r[9] the access function algorithms, how to couple them, the function of various rules, the interpretation policy of the RS, etc. This suggests that the study of designs of computational mechanisms may be a worthy area for a discovery system to pursue, whether its own design is available to it or not.
Rule systems whose designs adhere to these guidelines will be large, elaborate, and non-classical. We have mentioned throughout the paper several new complications which the principles introduce. Trying to produce such a RS for a task for which a pure, neo-classical production rule system was appropriate will probably result in disaster. Nevertheless, empirical evidence suggests that RSs having this architecture are quite natural -- and relatively tractable to construct -- for open-ended tasks like scientific discovery.
@i[ACKNOWLEDGEMENT]

This research builds upon Lenat's Ph.D. thesis at Stanford University, and he wishes to deeply thank his advisers and committee members: Bruce Buchanan, Edward Feigenbaum, Cordell Green, Donald Knuth, and Allen Newell. In addition, he gladly acknowledges the ideas he received in discussions with Dan Bobrow, Avra Cohn, and Randy Davis. Similarly, ideas received by Harris in two long and fruitful associations, with John Seely Brown and with Roger Schank, have contributed to this work. Many of our ideas have evolved through discussions at CMU this past year, notably with Don Cohen, John McDermott, Kamesh Ramakrishna, Paul Rosenbloom, James Saxe, and especially Herbert Simon.

---------------
9
E.g., the facet specifications. If the input/output requirements change with time, so should the rule system's data structures.
@i[REFERENCES]

[0] Bledsoe, W. W., and Tyson, M., @i[The UT Interactive Prover], University of Texas at Austin, Depts. of Mathematics and Computer Sciences, Automatic Theorem Proving Project Report No. 17, May, 1975.
[1] Bobrow, D., "Natural Language Input for a Computer Problem Solving System", in (Minsky, M., ed.), @i[Semantic Information Processing], The MIT Press, Cambridge, Massachusetts, 1968.
[2] Bobrow, D., and Winograd, T., @i[An Overview of KRL, A Knowledge Representation Language], Journal of Cognitive Science, Vol. 1, No. 1, January, 1977.
[3] Bobrow, R., and Brown, J. S., "Systematic Understanding", in (Bobrow, D., and Collins, A., eds.), @i[Representation and Understanding], Academic Press, S.F., 1975.
[4] Buchanan, B. G., Sutherland, G., and Feigenbaum, E., @i[Heuristic Dendral: A Program for Generating Explanatory Hypotheses in Organic Chemistry], in (Meltzer and Michie, eds.), Machine Intelligence 4, American Elsevier, N.Y., 1969, pp. 209-254.
[5] Buchanan, B. G., @i[Applications of Artificial Intelligence to Scientific Reasoning], Second USA-Japan Computer Conference, Tokyo, August 26-28, published by AFIPS and IPSJ, Tokyo, 1975, pp. 189-194.
[6] Davis, R., @i[Applications of Meta Level Knowledge to the Construction, Maintenance, and Use of Large Knowledge Bases], SAIL AIM-271, Artificial Intelligence Laboratory, Stanford University, July, 1976.
[7] Davis, R., and King, J., @i[An Overview of Production Systems], Report STAN-CS-75-524, Memo AIM-271, Stanford University CS Department, 1975.
[8] Dijkstra, E. W., "Notes on Structured Programming", in Dahl, Dijkstra, and Hoare, @i[Structured Programming], Academic Press, London, 1972, pp. 1-82.
[9] Hayes-Roth, F., and Lesser, V. R., @i[Focus of Attention in a Distributed Speech Understanding System], Computer Science Dept. Memo, Carnegie-Mellon University, Pittsburgh, Pa., January 12, 1976.
[10] Hewitt, C., @i[Viewing Control Structures as Patterns of Passing Messages], MIT AI Lab Working Paper 92, April, 1976.
[11] Lakatos, I., @i[Proofs and Refutations], Cambridge University Press, 1976.
[12] Lenat, D., @i[BEINGs: Knowledge as Interacting Experts], 4th IJCAI, Tbilisi, Georgian SSR, USSR, 1975.
[13] Lenat, D., @i[AM: An Artificial Intelligence Approach to Discovery in Mathematics as Heuristic Search], SAIL AIM-286, Artificial Intelligence Laboratory, Stanford University, July, 1976. Jointly issued as Computer Science Dept. Report No. STAN-CS-76-570.
[14] McCracken, D., @i[A Parallel Production System Architecture for Speech Understanding], CMU CS Dept. Ph.D. Thesis, 1977.
[15] Minsky, M., "A Framework for Representing Knowledge", in (Winston, P., ed.), @i[The Psychology of Computer Vision], McGraw-Hill, N.Y., 1975, pp. 211-277.
[16] Moran, T. P., @i[The Symbolic Imagery Hypothesis: An Empirical Investigation via a Production System Simulation of Human Behavior in a Visualization Task], CMU CS Dept. Thesis, 1973. See also 3IJCAI, pp. 472-477.
[17] Newell, A., Shaw, J., and Simon, H., @i[Empirical Explorations of the Logic Theory Machine: A Case Study in Heuristics], RAND Corp. Report P-951, March, 1957.
[18] Newell, A., and Simon, H., @i[Human Problem Solving], Prentice-Hall, Englewood Cliffs, New Jersey, 1972.
[19] Newell, A., @i[Production Systems: Models of Control Structures], CMU Report, May, 1973; also published in (W. G. Chase, ed.), @i[Visual Information Processing], Academic Press, N.Y., Chapter 10, pp. 463-526.
[20] Norman, D., and Bobrow, D., @i[On Data-limited and Resource-limited Processes], Journal of Cognitive Psychology, Vol. 7, 1975, pp. 44-64.
[21] Polya, G., @i[Mathematics and Plausible Reasoning], Princeton University Press, Princeton, Vol. 1, 1954; Vol. 2, 1954.
[22] Rychener, M. D., @i[Production Systems as a Programming Language for Artificial Intelligence Applications], Carnegie-Mellon University, Department of Computer Science, Pittsburgh, Pa., 1976.
[23] Shortliffe, E. H., @i[MYCIN -- A Rule-based Computer Program for Advising Physicians Regarding Antimicrobial Therapy Selection], Stanford AI Memo 251, October, 1974.
[24] Waterman, D. A., @i[Adaptive Production Systems], CIP Working Paper 285, CMU Psychology Dept., 1974. See also 4IJCAI, pp. 296-303.
[25] Waterman, D. A., @i[Generalization Learning Techniques for Automating the Learning of Heuristics], AI Journal (forthcoming).